Abstract

A sender holds a word x consisting of n blocks $x_i$, each of t bits, and wishes to broadcast
a codeword to m receivers, $R_1, \ldots, R_m$. Each receiver $R_i$ is interested in one block, and has
prior side information consisting of some subset of the other blocks. Let $\beta_t$ be the minimum
number of bits that has to be transmitted when each block is of length t, and let $\beta$ be the limit
$\beta = \lim_{t\to\infty} \beta_t/t$. In words, $\beta$ is the average communication cost per bit in each block (for long
blocks). Finding the coding rate $\beta$, for such an informed broadcast setting, generalizes several
coding theoretic parameters related to Informed Source Coding on Demand, Index Coding and
Network Coding.

In this work we show that usage of large data blocks may strictly improve upon the trivial 
encoding which treats each bit in the block independently. To this end, we provide general
bounds on $\beta_t$, and prove that for any constant C there is an explicit broadcast setting in which
$\beta = 2$ but $\beta_1 > C$. One of these examples answers a question of [15].

In addition, we provide examples with the following counterintuitive direct-sum phenomena. 
Consider a union of several mutually independent broadcast settings. The optimal code for the 
combined setting may yield a significant saving in communication over concatenating optimal 
encodings for the individual settings. This result also provides new non-linear coding schemes
which improve upon the largest known gap between linear and non-linear Network Coding, thus 
improving the results of [8]. 

The proofs are based on a relation between this problem and results in the study of Witsenhausen's
rate, OR graph products, colorings of Cayley graphs, and the chromatic numbers of
Kneser graphs.

1 Introduction



Source coding deals with a scenario in which a sender has some data string x he wishes to transmit 
through a broadcast channel to receivers. In this paper we consider a variant of source coding 
which was first proposed by Birk and Kol [6]. In this variant, called Informed Source Coding On 
Demand (ISCOD), each receiver has some prior side information, comprising a part of the input 
word x. The sender is aware of the portion of x known to each receiver. Moreover, each receiver is
interested in just part of the data. 

We formalize this source coding setting as follows. Suppose that a sender S wishes to broadcast 
a word $x = x_1 x_2 \cdots x_n$, where $x_i \in \{0,1\}^t$ for all i, to m receivers $R_1, \ldots, R_m$. Each $R_j$ has some
prior side information, consisting of some of the blocks $x_i$, and is interested in a single block $x_{f(j)}$.
The sender wishes to transmit a codeword that will enable each and every receiver $R_j$ to reconstruct
its missing block $x_{f(j)}$ from its prior information. Let $\beta_t$ denote the minimum possible length of
such a binary code. Our objective in this paper is to study the possible behaviors of $\beta_t$, focusing
on the more natural scenario of transmitting large data blocks (namely a large t).
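
As a minimal illustration of how side information can shorten the broadcast, consider a standard toy instance that is not taken from this paper: n = m = 2, receiver $R_1$ wants $x_1$ and knows $x_2$, and receiver $R_2$ wants $x_2$ and knows $x_1$. Broadcasting the single t-bit word $x_1 \oplus x_2$ suffices, so t bits are enough here instead of 2t. The function names `encode` and `decode` below are hypothetical, and blocks are represented as Python integers.

```python
def encode(x1: int, x2: int) -> int:
    """Broadcast one t-bit word: the bitwise XOR of the two blocks."""
    return x1 ^ x2


def decode(codeword: int, known_block: int) -> int:
    """A receiver XORs the broadcast with the block it already holds."""
    return codeword ^ known_block


# Exhaustive check for block length t = 4 (an arbitrary small choice).
t = 4
all_recovered = all(
    decode(encode(x1, x2), x2) == x1 and decode(encode(x1, x2), x1) == x2
    for x1 in range(2 ** t)
    for x2 in range(2 ** t)
)
```

Each receiver recovers its missing block from the same codeword, which is the basic effect the parameter $\beta_t$ measures.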

The motivation for informed source coding is in applications such as Video on Demand. In such 
applications, a network, or a satellite, has to broadcast information to a set of clients. During the 
first transmission, each receiver misses a part of the data. Hence, each client is now interested in 
a different (small) part of the data, and has prior side information, consisting of the part of the
data he received [21]. Note that our assumption that each receiver is interested only in a single 
block is not necessary. To see this, one can simulate a receiver interested in r blocks by r receivers, 
each interested in one of these blocks, and all having the same side information. 

The problem above generalizes the problem of Index Coding, which was first presented by Birk 
and Kol [6], and later studied by Bar-Yossef, Birk, Jayram and Kol [4] and by Lubetzky and Stav
[15]. Index Coding is equivalent to a special case of our problem, in which m = n, f(j) = j for all
$j \in [m] = \{1, \ldots, m\}$ and the size of the data blocks is t = 1. Our work can also be considered
in the context of Network Coding, a term which was coined by Ahlswede, Cai, Li, and Yeung [3]. 
In a Network Coding problem it is asked whether a given communication network (with limited 
capacities on each link) can meet its requirement, passing a certain amount of information from a 
set of source vertices to a set of targets. 

It will be easier to describe our source coding problems in terms of a certain hypergraph. We 
define a directed hypergraph $H = (V, E)$ on the set of vertices $V = [n]$. Each vertex i of H
corresponds to an input block $x_i$. The set E of m edges corresponds to the receivers $R_1, \ldots, R_m$.
For the receiver $R_j$, E contains a directed edge $e_j = (f(j), N(j))$, where $N(j) \subseteq [n]$ denotes the
set of blocks which are known to receiver $R_j$. Clearly the structure of H captures the definition of
the broadcast setting. We thus denote by $\beta_t(H)$ the minimal number of bits required to broadcast
the information to all the receivers when the block length is t. 

Let H be such a directed hypergraph. For any pair of integers $t_1$ and $t_2$, when the block length
is $t_1 + t_2$, it is possible to encode the first $t_1$ bits, then separately encode the remaining $t_2$ bits.
By concatenating these two codes we get $\beta_{t_1+t_2}(H) \le \beta_{t_1}(H) + \beta_{t_2}(H)$, i.e. $\beta_t(H)$ is sub-additive.
Fekete's Lemma thus implies that the limit $\lim_{t\to\infty} \beta_t(H)/t$ exists and equals $\inf_t \beta_t(H)/t$. We
define $\beta(H)$ to be this limit:

$$\beta(H) := \lim_{t\to\infty} \frac{\beta_t(H)}{t} = \inf_t \frac{\beta_t(H)}{t} .$$

In words, $\beta$ is the average asymptotic number of encoding bits needed per bit in each input block.

To study this problem, we will also consider the following related one. Let $k \cdot H$ denote the
disjoint union of k copies of H. Define $\beta^*_t(H) := \beta_1(t \cdot H)$. In words, $\beta^*_t$ represents the minimal
number of bits required if the network topology is replicated t independent times^1. A similar
sub-additivity argument justifies the definition of the limit

$$\beta^*(H) := \lim_{t\to\infty} \frac{\beta^*_t(H)}{t} = \inf_t \frac{\beta^*_t(H)}{t} .$$

By viewing each receiver in the broadcast network as t receivers, each interested in a single bit,
we can compare this scenario with the setting of independent copies. Clearly, the receivers in the
first scenario have additional information and hence $\beta_t(H) \le \beta^*_t(H)$. Taking limits we
get $\beta(H) \le \beta^*(H)$.

There are several lower bounds for $\beta(H)$. One such simple bound, which we denote by $\alpha(H)$,
is the maximal size of a set S of vertices satisfying the following: For every $v \in S$ there exists some
$e = (v, J) \in E$ so that $J \cap S = \emptyset$. A simple counting argument shows that $\alpha(H) \le \beta(H)$, giving^2

$$\alpha(H) \le \beta(H) \le \beta^*(H) \le \beta_1(H) . \qquad (1)$$
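
The definition of $\alpha(H)$ can be checked by brute force on small instances. The sketch below is only an illustration: the function name `alpha` and the 0-based vertex labels are assumptions of this example. It evaluates the bound on the 5-cycle network of Theorem 1.2, where receiver i wants block i and knows blocks i-1 and i+1 (mod 5), and compares it with the values of $\beta^*$ and $\beta_1$ stated in that theorem.

```python
import math
from itertools import combinations


def alpha(n, edges):
    """Largest |S| such that every v in S has an edge (v, J) with J disjoint from S.

    edges is a list of pairs (v, J): a receiver wants block v and knows the blocks in J.
    """
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            s = set(subset)
            if all(any(head == v and not (set(J) & s) for head, J in edges) for v in s):
                return size
    return 0


# 5-cycle network: receiver i wants block i and knows its two cyclic neighbours.
c5_edges = [(i, ((i - 1) % 5, (i + 1) % 5)) for i in range(5)]
alpha_c5 = alpha(5, c5_edges)

# Chain (1) with the values quoted in Theorem 1.2: beta* = 5 - log2(5), beta_1 = 3.
chain_holds = alpha_c5 <= 5 - math.log2(5) <= 3
```

For this network the condition on S says that S is an independent set of the cycle, so the search returns the independence number of $C_5$.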



1.1 Our Results 

Let H = {[n],E) be a directed hypergraph for a broadcast network, and set t = 1. It will be 
convenient to address the more precise notion of the number of codewords in a broadcast code 
which satisfies H. We say that C, a broadcast code for H, is optimal, if it contains the minimum
possible number of codewords (in which case, $\beta_1(H) = \lceil \log_2 |C| \rceil$). We say that two input-strings
$x, y \in \{0,1\}^n$ are confusable if there exists a receiver $e = (i, J) \in E$ such that $x_i \ne y_i$ but $x_j = y_j$
for all $j \in J$. This implies that the input-strings x, y cannot be encoded with the same codeword.
Denote by $\gamma$ the maximal cardinality of a set of input-strings which is pairwise unconfusable.
The first technical result of this paper relates $\beta^*$ and $\gamma$.



Theorem 1.1. Let H and $\gamma$ be defined as above. The following holds for any integer k:

$$\left(\frac{2^n}{\gamma}\right)^k \le |C| \le \left\lceil \left(\frac{2^n}{\gamma}\right)^k k n \log 2 \right\rceil , \qquad (2)$$

where C is an optimal code for $k \cdot H$. In particular, $\beta^*(H) = \lim_{k\to\infty} \beta_1(k \cdot H)/k = n - \log_2 \gamma$.



^1 Such a scenario can occur when the topology is standard (e.g. resulting from using a common application or
operating system). Therefore it is identical across networks, albeit with different data.

^2 The bound $\alpha$ given here generalizes the bound given in [4] to directed hypergraphs, as well as to $\beta$ (rather than
just to $\beta_1$). Another bound in the Index Coding model is the MAIS (maximum acyclic induced subgraph) bound
given in [4], that can also be generalized to our model.

A surprising corollary of the above theorem is that $\beta^*$ may be strictly smaller than $\beta_1$. Indeed,
as $\beta^*$ deals with the case of disjoint instances, it is not intuitively clear that this should be the case:
one would think that there can be no room for improving upon $\beta_1(H)$ when replicating H into t
disjoint copies, given the total independence between these copies (no knowledge on blocks from
other copies, independently chosen inputs). Note that even in the somewhat related Information
Theoretic notion of the Shannon capacity of graphs (corresponding to channel coding rather than
source coding), though it is known that the capacity of a disjoint union may exceed the sum of the
individual capacities (see [1]), it is easy to verify that disjoint unions of the same graph can never
achieve this. The following theorem demonstrates the possible gap between $\beta_1(t \cdot H)/t$ and $\beta_1(H)$
even in a very limited setting, which coincides with Index Coding. This solves the open problem
presented by [15] already for the smallest possible n = 5.

Theorem 1.2. Define a broadcast network $H = (\mathbb{Z}_5, E)$ based on the odd cycle $C_5$: For each $i \in \mathbb{Z}_5$,
there is a directed edge $(i, \{i-1, i+1\})$, where the arithmetic is modulo 5. Then $\beta_1(H) = 3$,
whereas $\beta^*(H) = 5 - \log_2 5 \approx 2.68$.

It is worth noting that in the example above, the optimal code for H contains 8 codewords, 
whereas in the limit, each disjoint copy of H costs 6.4 codewords, hence this surprising direct-sum 
phenomenon carries beyond any integer rounding issues. In addition, in Section 3.3 (Theorem 3.15) 
we generalize the above example to any broadcast network which is based on a complement of an 
odd cycle. 

The following theorem extends the above results on the gap between $\beta^*$ and $\beta_1$ even further,
by providing an example where $\beta^*$ is bounded whereas $\beta_1$ can be arbitrarily large:

Theorem 1.3. There exists an explicit infinite family of broadcast networks for which $\beta^*(H) < 3$
is bounded and yet $\beta_1(H)$ is unbounded.

Finally, recalling (1), one would expect that in many cases $\beta$ should be strictly smaller than $\beta^*$,
as the receivers possess more side information. However it is not clear how much can be gained by
this additional information. We construct an example where not only is there a difference between
the two, but $\beta$ is constant while $\beta^*$ is unbounded.

Theorem 1.4. There exists an explicit infinite family of broadcast networks for which $\beta(H) = 2$
is constant whereas $\beta^*(H)$ is unbounded.

We discuss applications of the results to Network and Index coding in what follows.

1.2 Related Work

Our work is a generalization of Index Coding, which was first studied by Birk and Kol [6]. This 
problem deals with a sender, who wishes to send n blocks of data to n receivers, where each 
receiver knows a subset of the blocks, and is interested in a single block (different receivers are 
interested in different blocks). The sender can only utilize a broadcast channel, and we wish to 
minimize the number of bits he has to send. Birk and Kol presented a class of encodings, based on 
erasure Reed Solomon codes. They also dealt with some of the practical issues of this scheme, such
as synchronization between the clients and the server. Finally they gave examples for scenarios
where their codes were not optimal, and presented the question of finding better codes. The first
improvement to the original codes was done by Birk, Bar-Yossef, Jayram and Kol, who found a
lower bound to the minimal length of linear codes, called the min-rank. They also conjectured that
linear codes are optimal for index coding, a conjecture that was later refuted by Lubetzky and
Stav. However, the proof by Lubetzky and Stav was limited, in the sense that they constructed an
index coding problem, for which linear codes over any field were not optimal, and yet a combination 
of linear codes over several fields may well be optimal. Theorem 1.2 refutes the conjecture in a 
stronger sense, by showing an index coding problem for which the optimal solution is not linear for 
any field size or even any combination of several fields. 

Network Coding deals with a scenario in which several sources wish to pass information to 
several targets, when the communication network is modeled by a graph. Each edge has a capacity, 
and the goal is to see if the network is satisfiable, i.e. if it is possible to meet all the demands of the 
clients. This very intuitive model of communication is motivated by the Internet, where routers 
pass information from different sites to users. It has been believed that routers need only store and
forward data (Multi Commodity Flow), without processing it at all. This intuition was proved false
by Ahlswede, Cai, Li, and Yeung [3], who showed a very simple network (the Butterfly Network),
which was only satisfiable if one of the nodes processed the data which entered it. The encoding 
done in this example was linear, and for some time it was not clear if non linearity is beneficial in 
constructing optimal network codes. The work of Dougherty, Freiling, and Zeger [8] answered this 
in the affirmative, giving an example of a network in which non linear codes are essential in order to 
achieve the required network capacity. Their construction relies on the parity of the characteristic 
of the underlying field, and gives a ratio of 1.1 between the coding capacity and the linear coding 
capacity. Another way to achieve a gap between linear and non linear codes was presented in [7]. 
Improving this ratio, as well as finding new ways to create such gaps are open problems in the field 
of Network Coding (see [20] for a survey). 

To see that our model is indeed a special case of network coding, we present the following
simple reduction from a directed hypergraph which describes a broadcast network $H = (V, E)$
to a network coding problem. We build a network of n sources $s_1, \ldots, s_n$, and m sinks $t_1, \ldots, t_m$.
There are also two special vertices u and w. Letting

$$E_\infty = \{(s_i, u) : i \in [n]\} \cup \{(w, t_e) : e \in E\} \cup \{(s_j, t_e) : e = (i, J) \in E,\ j \in J\} ,$$

the network has an edge with infinite capacity for each $e \in E_\infty$. In addition to that, the network
has a single edge with finite capacity, (u, w). If each source receives an input of t bits, the demand
of the network can be satisfied if and only if the capacity of (u, w) is at least $\beta_t(H)$. Moreover, this
reduction maintains linearity of codes.
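
The reduction above can be sketched as a small graph construction. This is only an illustrative sketch: the function name `build_network` and the string labels for vertices are assumptions of this example, and capacities are represented implicitly by returning the infinite-capacity edges and the single finite-capacity edge separately.

```python
def build_network(n, hyperedges):
    """Return (vertices, infinite_edges, bottleneck_edge) for H = ([n], E).

    hyperedges is a list of pairs (i, J) with 1-based block indices:
    the receiver wants block i and knows the blocks in J.
    """
    sources = [f"s{i}" for i in range(1, n + 1)]
    sinks = [f"t{e}" for e in range(1, len(hyperedges) + 1)]
    infinite_edges = [(s, "u") for s in sources]          # (s_i, u)
    for sink, (_, known) in zip(sinks, hyperedges):
        infinite_edges.append(("w", sink))                # (w, t_e)
        infinite_edges.extend((f"s{j}", sink) for j in sorted(known))  # (s_j, t_e)
    return sources + sinks + ["u", "w"], infinite_edges, ("u", "w")


# Example: the 5-cycle network of Theorem 1.2, with blocks numbered 1..5.
c5 = [(i, {(i - 2) % 5 + 1, i % 5 + 1}) for i in range(1, 6)]
vertices, infinite_edges, bottleneck = build_network(5, c5)
```

The construction always has n + m + 2 vertices. For n = m = 23 this gives 48, which is consistent with the vertex count in Corollary 1.5.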

This reduction enables us to translate some of our results to the network coding model. In 
particular, we prove the following corollary of our results, improving the results of [8]: 

Corollary 1.5. There exists a network with 48 vertices such that the ratio between the coding 
capacity and the linear coding capacity in it is at least 1.324. 

The corollary is based on the results of Appendix A.3, where we show that for a certain directed
hypergraph, $H_{23}$, any linear code requires 3 bits, while $\beta^*(H_{23}) < 2.265$.

2 Optimal codes for a disjoint union of directed hypergraphs



The size of an optimal code for a given directed hypergraph describing a broadcast network may be 
restated as a problem of determining the chromatic number of a graph, as observed by Bar-Yossef 
et al. for the Index Coding model [4]. Consider the block-length t = 1, and define the following:

Definition 1 (Confusion graph). Let $H = ([n], E)$ be a directed hypergraph describing a broadcast
network. The confusion graph of H, $\mathfrak{C}(H)$, is the undirected graph on the vertex set $\{0,1\}^n$, where
two vertices $x \ne y$ are adjacent iff for some $e = (i, J) \in E$, $x_i \ne y_i$ and yet $x_j = y_j$ for all $j \in J$.

In other words, $\mathfrak{C}(H)$ is the graph whose vertex set is all possible input-words, and two vertices
are adjacent iff they are confusable, meaning they cannot be encoded by the same codeword for H
(otherwise, the decoding of at least one of the receivers would be ambiguous). Hence, a code for
H is equivalent to a legal vertex coloring of $\mathfrak{C}(H)$, where each color class corresponds to a distinct
codeword. Consequently, if C is an optimal code for H, then $|C| = \chi(\mathfrak{C}(H))$.

Similarly, one can define $\mathfrak{C}_t(H)$, the confusion graph corresponding to H with block-length t.
From now on, throughout this section, the length t of the blocks considered will be 1.

Proof of Theorem 1.1. The OR graph product is equivalent to the complement of the strong
product^3, which was thoroughly studied in the investigation of the Shannon capacity of a graph, a
notoriously challenging graph parameter introduced by Shannon [18].

Definition 2 (OR graph product). The OR graph product of $G_1$ and $G_2$, denoted by $G_1 \vee G_2$,
is the graph on the vertex set $V(G_1) \times V(G_2)$, where $(u, v)$ and $(u', v')$ are adjacent iff either
$uu' \in E(G_1)$ or $vv' \in E(G_2)$ (or both). Let $G^{\vee k}$ denote the k-fold OR product of a graph G.

Let $H_1$ and $H_2$ denote directed hypergraphs (as before) on the vertex-sets [m] and [n] respectively,
and consider an encoding scheme for their disjoint union, $H_1 + H_2$. As there are no edges
between $H_1$ and $H_2$, such a coding scheme cannot encode two input-words $x, y \in \{0,1\}^{m+n}$ by
the same codeword iff this forms an ambiguity either with respect to $H_1$ or with respect to $H_2$ (or
both). Hence:

Observation 2.1. For any pair $H_1, H_2$ of directed hypergraphs as above, the graphs $\mathfrak{C}(H_1 + H_2)$
and $\mathfrak{C}(H_1) \vee \mathfrak{C}(H_2)$ are isomorphic.
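
Observation 2.1 can be checked mechanically on small instances. The sketch below uses two hypothetical toy networks (they are not taken from the paper) and 0-based indices; the helper names `confusable` and `disjoint_union` are assumptions of this example. It compares adjacency in the confusion graph of the disjoint union with adjacency in the OR product of the two confusion graphs, under the natural identification of a word of length m + n with a pair of words.

```python
from itertools import product


def confusable(x, y, edges):
    """x, y are bit tuples; edges is a list of (i, J) with 0-based indices."""
    return any(x[i] != y[i] and all(x[j] == y[j] for j in J) for i, J in edges)


def disjoint_union(n1, edges1, edges2):
    """Edges of H1 + H2, with the vertices of H2 shifted by n1."""
    return list(edges1) + [(i + n1, tuple(j + n1 for j in J)) for i, J in edges2]


# Two small hypothetical networks.
n1, edges1 = 2, [(0, (1,)), (1, (0,))]
n2, edges2 = 3, [(0, (1,)), (1, (2,)), (2, ())]
union = disjoint_union(n1, edges1, edges2)

words = list(product((0, 1), repeat=n1 + n2))
graphs_agree = all(
    confusable(x, y, union)
    == (confusable(x[:n1], y[:n1], edges1) or confusable(x[n1:], y[n1:], edges2))
    for x in words
    for y in words
)
```

The right-hand side of the comparison is exactly the adjacency rule of Definition 2 applied to the pairs (x[:n1], x[n1:]) and (y[:n1], y[n1:]).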

Thus, the number of codewords in an optimal code for $k \cdot H$ is equal to $\chi(\mathfrak{C}(H)^{\vee k})$. The
chromatic numbers of strong powers of a graph, as well as those of OR graph powers, have been
studied intensively. In the former case, they correspond to the Witsenhausen rate of a graph (see
[19]). In the latter case, the following was proved by McEliece and Posner [16], and also by Berge
and Simonovits [5]:

$$\lim_{k\to\infty} \left(\chi(G^{\vee k})\right)^{1/k} = \chi_f(G) , \qquad (3)$$

where $\chi_f(G)$ is the fractional chromatic number of the graph G, defined as follows. A legal vertex
coloring corresponds to an assignment of {0, 1}-weights to independent-sets, such that every vertex
will be "covered" by a total weight of at least 1. A fractional coloring is the relaxation of this
problem where the weights belong to [0, 1], and $\chi_f$ is the minimum possible sum of weights in such
a fractional coloring.

^3 Namely, the OR product of $G_1$ and $G_2$ is the complement of the strong product of $\overline{G_1}$ and $\overline{G_2}$.

To obtain an estimate on the rate of the convergence in (3), we will use the following well-known 
properties of the fractional chromatic number and OR graph products (cf. [2], [14], [12] and also [9]): 

(i) For any graph G, $\chi_f(G^{\vee k}) = \chi_f(G)^k$.

(ii) For any graph G, $\chi_f(G) \le \chi(G) \le \lceil \chi_f(G) \log |V(G)| \rceil$. [This is proved by selecting
$\lceil \chi_f(G) \log |V(G)| \rceil$ independent sets, each chosen randomly and independently according
to the weight distribution, dictated by the optimal weight-function achieving $\chi_f$. One can
show that the expected number of uncovered vertices is less than 1.]

(iii) For any vertex transitive graph G (that is, a graph whose automorphism group is transitive),
$\chi_f(G) = |V(G)|/\alpha(G)$ (cf., e.g., [10]), where $\alpha(G)$ is the independence number of G.

In order to translate (ii) to the statement of (2), notice that $\gamma$, as defined in Theorem 1.1, is precisely
$\alpha(\mathfrak{C}(H))$, the independence number of the confusion graph. In addition, the graph $\mathfrak{C}(H)$ is indeed
vertex transitive, as it is a Cayley graph of $\mathbb{Z}_2^n$. Combining the above facts, we obtain that:

$$\left(\chi_f(\mathfrak{C}(H)^{\vee k})\right)^{1/k} = \chi_f(\mathfrak{C}(H)) = \frac{2^n}{\alpha(\mathfrak{C}(H))} = \frac{2^n}{\gamma} .$$

Plugging the above equation into (ii), while recalling that $\chi(\mathfrak{C}(H)^{\vee k})$ is the size of the optimal
code for $k \cdot H$, completes the proof of the theorem. ■

Remark 2.2: The right-hand-side of (2) can be replaced by $\lceil (2^n/\gamma)^k (1 + k \log \gamma) \rceil$. To see this,
combine the simple fact that $\alpha(G^{\vee k}) = \alpha(G)^k$ with the bound $\chi(G) \le \lceil \chi_f(G)(1 + \log \alpha(G)) \rceil$ given
in [14] (which can be proved by choosing $\lceil \chi_f(G) \log \alpha(G) \rceil$ independent sets randomly as before,
leaving at most $\lceil \chi_f(G) \rceil$ uncovered vertices, to be covered separately).



3 The possible gaps between the parameters $\beta^*$ and $\beta_1$

As noted in (1), $\beta \le \beta^* \le \beta_1$. In this section, we describe networks where the gap between any
two of these parameters can be unbounded. The first construction is of a directed hypergraph on
n vertices where $\beta = 2$ while $\beta^*$ and $\beta_1$ are both $\Theta(\log n)$, for any n of the form $2^k - 1$. We then describe a
more surprising, general construction which provides a family of directed hypergraphs, for which
$\beta^* < 3$ while $\beta_1 = \Omega(\log \log n)$. Finally, we describe simple scenarios where even in the restricted
Index Coding model, taking disjoint copies of the network can be encoded strictly better than
concatenating the encodings of each of the copies. These constructions also apply to network
coding, where for the latter ones it is also possible to prove a lower bound on the length of the
optimal linear encoding scheme.

Throughout this section, we use the following notations. For binary vectors u and v, let $|u|$
denote the Hamming weight of u, and let $u \oplus v$ be the bitwise xor of u and v.

3.1 Proof of Theorem 1.4



Consider a scenario in which the input word consists of a block $x_i$ for each nonzero element $i \in \mathbb{Z}_2^k$,
thus the number of blocks is $n = 2^k - 1$. For any pair of distinct nonzero elements $i, j \in \mathbb{Z}_2^k$, there
exists a receiver which is interested in the block $x_i$ and knows all other blocks except for $x_j$. This
scenario corresponds to a directed hypergraph H in which the vertices are the nonzero elements of
$\mathbb{Z}_2^k$, and for every pair of distinct nonzero $i, j \in \mathbb{Z}_2^k$ we have a directed edge $(i, \mathbb{Z}_2^k \setminus \{0, i, j\})$.

Let $\mathfrak{C} = \mathfrak{C}(H)$ be the confusion graph of H for block-length t = 1. Since each receiver is missing
precisely two blocks, for any pair of distinct input words $u, v \in \{0,1\}^n$, whenever $|u \oplus v| \ge 3$, every
receiver can distinguish between the two words. On the other hand, if $|u \oplus v| \le 2$, there is some
receiver who may confuse his block in u and v. Thus, $\mathfrak{C}$ is exactly the Cayley graph of $\mathbb{Z}_2^n$ whose
generators are all elementary unit vectors $e_i$, as well as all vectors $e_i \oplus e_j$.

Claim 3.1. The above defined graph $\mathfrak{C}$ satisfies $\chi(\mathfrak{C}) \le n + 1$.

Proof. Since $n = 2^k - 1$ there is a Hamming code of length n, and the required coloring of any
vector u is given by computing its syndrome. More explicitly, define $c : V(\mathfrak{C}) \to \mathbb{Z}_2^k$ as follows. For
a vector $u = (u_1, \ldots, u_n) \in \mathbb{Z}_2^n$, put $c(u) := \sum_{i \in \mathbb{Z}_2^k \setminus \{0\}} u_i \cdot i$. Let $u, v \in \mathbb{Z}_2^n$ be a pair of adjacent
vertices in $\mathfrak{C}$. Thus $1 \le |u \oplus v| \le 2$, and $c(u) \oplus c(v)$ is a sum (in $\mathbb{Z}_2^k$) of one or two distinct nonzero
elements of $\mathbb{Z}_2^k$, which is not zero. Thus $c(u) \ne c(v)$, as needed. ■
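
The syndrome coloring in the proof of Claim 3.1 can be verified exhaustively for a small case. The sketch below assumes k = 3 (so n = 7), represents a word u as a 7-bit integer whose bit at position p stands for the block labelled p + 1, and identifies the nonzero elements of $\mathbb{Z}_2^3$ with the integers 1..7 under XOR; these encodings and the name `syndrome` are choices of this example.

```python
from itertools import combinations

k = 3
n = 2 ** k - 1  # 7 blocks, labelled by the nonzero elements 1..7 of Z_2^3


def syndrome(u: int) -> int:
    """XOR of the labels of all blocks whose bit is set in u."""
    c = 0
    for pos in range(n):
        if (u >> pos) & 1:
            c ^= pos + 1
    return c


# Generators of the confusion graph: all e_i and all e_i xor e_j.
generators = [1 << a for a in range(n)]
generators += [(1 << a) | (1 << b) for a, b in combinations(range(n), 2)]

# The coloring is proper if adjacent words always get different syndromes.
proper = all(syndrome(u) != syndrome(u ^ g) for u in range(2 ** n) for g in generators)
num_colors = len({syndrome(u) for u in range(2 ** n)})
```

In this instance the coloring uses n + 1 = 8 colors, one for each element of $\mathbb{Z}_2^3$.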

Claim 3.2. The above defined graph $\mathfrak{C}$ satisfies $\chi_f(\mathfrak{C}) \ge n + 1$, and thus $\chi(\mathfrak{C}) = \chi_f(\mathfrak{C}) = n + 1$.

Proof. Recall that the clique number of the graph, namely the size of the largest clique in the
graph, provides a lower bound on the fractional chromatic number. Thus it suffices to show that
$\mathfrak{C}$ contains a clique of size n + 1. Define the vertex set $S = \{0\} \cup \{e_i\}_{i=1}^{n}$ of size n + 1. For any
distinct $u, v \in S$, $|u \oplus v| \le 2$ and hence S induces a clique in $\mathfrak{C}$, completing the proof of the inequality. The
equality follows from the previous claim. ■

Corollary 3.3. The parameters $\beta_1(H), \beta^*(H)$ satisfy $\beta^*(H) = \beta_1(H) = \log_2(n + 1)$.

Proof. Recall that $\beta_1(H) = \lceil \log_2 \chi(\mathfrak{C}(H)) \rceil$, and that in Theorem 1.1 we have actually shown that
$\beta^*(H) = \log_2 \chi_f(\mathfrak{C}(H))$. The proof therefore follows from the fact that $\chi(\mathfrak{C}) = \chi_f(\mathfrak{C}) = n + 1$. ■

Let $\mathfrak{C}_t = \mathfrak{C}_t(H)$ denote the confusion graph for block-length t. Thus, $\mathfrak{C}_t$ is the Cayley graph of
$\mathbb{Z}_2^{tn}$ whose generators are all vectors $\{(w_1, \ldots, w_n) \mid w_i \in \mathbb{Z}_2^t,\ 1 \le |\{i \mid w_i \ne 0\}| \le 2\}$. In other
words, two vertices are connected in the confusion graph if they differ at no more than 2 blocks.

Claim 3.4. For $2^t > n$, $\chi(\mathfrak{C}_t) = \chi_f(\mathfrak{C}_t) = 2^{2t}$.

Proof. For a lower bound, it suffices to show a set of size $2^{2t}$ which is a clique in $\mathfrak{C}_t$. Consider
the vertex set $\{(u_1, u_2, 0, \ldots, 0) \mid u_1, u_2 \in \mathbb{Z}_2^t\}$, which consists of $2^{2t}$ vertices. Every pair of
vertices in this set is connected in $\mathfrak{C}_t$ since they differ in at most two blocks, and therefore this is
a clique in $\mathfrak{C}_t$. This shows that $\chi(\mathfrak{C}_t) \ge \chi_f(\mathfrak{C}_t) \ge 2^{2t}$.

To complete the proof, we describe a proper coloring of $\mathfrak{C}_t$ which uses $2^{2t}$ colors, using a simple
Reed-Solomon code. Let $\alpha_1, \ldots, \alpha_n$ be pairwise distinct elements in the finite field $GF(2^t)$, and
define the coloring $c : GF(2^t)^n \to GF(2^t) \times GF(2^t)$ as follows. For a vector $u = (u_1, \ldots, u_n)$, let
$c(u) := \left(\sum_{i=1}^n u_i,\ \sum_{i=1}^n \alpha_i u_i\right)$. Clearly, if $u, v \in GF(2^t)^n$ differ in exactly one block then the first
coordinate of c(u) and c(v) is different. Moreover, if u and v differ in exactly two blocks i and j then
either $u_i + u_j \ne v_i + v_j$ or $\alpha_i u_i + \alpha_j u_j \ne \alpha_i v_i + \alpha_j v_j$ (or both inequalities hold), since $\alpha_i \ne \alpha_j$, and again they
will have different colors as needed. This shows that the coloring c is indeed proper and completes
the proof of the claim. ■
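
The Reed-Solomon style coloring from the proof of Claim 3.4 can be checked on a tiny instance. The sketch below assumes t = 2 and n = 3 (so $2^t > n$ holds), represents elements of GF(4) as the integers 0..3 with the irreducible polynomial $x^2 + x + 1$, and picks the distinct field elements 1, 2, 3 for the $\alpha_i$; all of these concrete choices, and the helper names, are assumptions of this example.

```python
from itertools import product

T = 2              # block length t
MODULUS = 0b111    # x^2 + x + 1, irreducible over GF(2)
ALPHAS = [1, 2, 3]  # pairwise distinct elements of GF(4), one per block


def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^T) given as bit-encoded polynomials."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> T:       # degree reached T: reduce modulo MODULUS
            a ^= MODULUS
    return result


def color(u):
    """c(u) = (sum of u_i, sum of alpha_i * u_i), addition in GF(2^T) is XOR."""
    s0, s1 = 0, 0
    for alpha_i, u_i in zip(ALPHAS, u):
        s0 ^= u_i
        s1 ^= gf_mul(alpha_i, u_i)
    return (s0, s1)


def blocks_differing(u, v):
    return sum(a != b for a, b in zip(u, v))


words = list(product(range(2 ** T), repeat=len(ALPHAS)))
proper = all(
    color(u) != color(v)
    for u in words
    for v in words
    if 1 <= blocks_differing(u, v) <= 2
)
num_colors = len({color(u) for u in words})
```

Words that differ in one or two blocks always receive different colors here, and the number of colors used equals $2^{2t} = 16$.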

Recalling that $\beta_t(H) = \lceil \log_2 \chi(\mathfrak{C}_t) \rceil$ we obtain the following corollary, which together with
Corollary 3.3 completes the proof of Theorem 1.4.

Corollary 3.5. For the hypergraph H defined above, $\beta(H) = \lim_{t\to\infty} \frac{1}{t} \log_2 \chi(\mathfrak{C}_t(H)) = 2$.



3.2 Proof of Theorem 1.3 

The basic ingredient of the construction is a Cayley graph G of an Abelian group $K = \{k_1, \ldots, k_n\}$
of size n, for which there is a large gap between the chromatic number and the fractional chromatic
number. In our context, we use $K = \mathbb{Z}_2^k$, though such a Cayley graph of any Abelian group will do.

Lemma 3.6. For any $n = 2^k$ there exists an explicit Cayley graph G over the Abelian group $\mathbb{Z}_2^k$
for which $\chi(G) > 0.01\sqrt{\log n}$ and yet $\chi_f(G) < 2.05$.

Proof. Let $n = 2^k$, and consider the graph G on the set of vertices $\mathbb{Z}_2^k$ defined as follows. For any $i, j \in \mathbb{Z}_2^k$,
the edge $(i, j) \in G$ iff $|i \oplus j| \ge k - \sqrt{k}/100$. We will now show that this graph has a large gap

$$\chi(G)/\chi_f(G) \ge \Omega(\sqrt{k}) = \Omega(\sqrt{\log n}) .$$

Claim 3.7. The chromatic number of G satisfies $\chi(G) \ge \sqrt{k}/100 + 2$.

Proof. The induced subgraph of G on the vertices $\{i \in \mathbb{Z}_2^k \mid |i| = s\}$ where $s = k/2 - \sqrt{k}/200$ is the
Kneser graph K(k, s) whose chromatic number is precisely k - 2s + 2, as proved in [13], using
the Borsuk-Ulam Theorem. It is worth noting that one can give a slightly simpler, self-contained
(topological) proof of this claim, based on the approach of [11]. ■

Claim 3.8. The fractional chromatic number of G satisfies $\chi_f(G) < 2.05$.

Proof. Since G is a Cayley graph, it is well known that $\chi_f(G) = |V(G)|/\alpha(G)$ (cf. e.g. [17]),
and therefore it suffices to show it contains an independent set of size at least $2^k/2.05$. Let
$I = \{i \in \mathbb{Z}_2^k \mid |i| < k/2 - \sqrt{k}/200\}$. Obviously the set I is an independent set as the Hamming weight of $i \oplus j$ is
below $k - \sqrt{k}/100$ for any $i, j \in I$, and therefore $(i, j) \notin E(G)$. Hence,

$$\chi_f(G) \le \frac{2^k}{|I|} = \frac{2^k}{\sum_{s < k/2 - \sqrt{k}/200} \binom{k}{s}} < 2.05 . \qquad ■$$

This completes the proof of Lemma 3.6.

Let H be the directed hypergraph on the vertices $V = \{1, 2, \ldots, n\}$ defined as follows. For each
pair of vertices i, j such that $k_i, k_j$ are adjacent in G (i.e. $k_i - k_j$ is a generator in the defining
set of G), H contains the directed edges $(i, V \setminus \{i, j\})$ and $(j, V \setminus \{i, j\})$. As before, every receiver
misses precisely two blocks.

Let $\mathfrak{C} = \mathfrak{C}(H)$ be the confusion graph of H (for block-length t = 1). Thus, $\mathfrak{C}$ is the Cayley
graph of $\mathbb{Z}_2^n$ whose generators are all vectors $e_i$, as well as all vectors $e_i \oplus e_j$ so that $k_i, k_j$ are
adjacent in G.

Claim 3.9. The chromatic number of $\mathfrak{C}$ satisfies $\chi(G) \le \chi(\mathfrak{C}) \le 3 \cdot \chi(G)$.

Proof. The fact that $\chi(G) \le \chi(\mathfrak{C})$ follows from the observation that the induced subgraph of $\mathfrak{C}$
on the vertices $e_i$ is precisely G (and hence, we similarly have $\chi_f(G) \le \chi_f(\mathfrak{C})$).

It remains to prove that $\chi(\mathfrak{C}) \le 3\chi(G)$. Let c be some optimal coloring of G with $d = \chi(G)$
colors. Define a coloring $c'$ of $\mathfrak{C}$ with 3d colors as follows. For a vertex $x = x_1 \ldots x_n$ assign the
color $c'(x) = (|x| \bmod 3,\ \sum_i x_i \cdot c(i))$ where the sum is in $\mathbb{Z}_d$. Clearly, $c'$ uses 3d colors. It remains
to show that this is indeed a legal coloring. Consider x, y which are adjacent in $\mathfrak{C}$, thus either
$x \oplus y = e_i$ (and then $|x| \not\equiv |y| \bmod 3$) or $x \oplus y = e_i \oplus e_j$. If $|x| \not\equiv |y| \bmod 3$ they will have different colors,
otherwise, $x \oplus y = e_i \oplus e_j$, $|x| = |y|$ and there exists $z \in \mathbb{Z}_d$ so that $c'(x) = (|x| \bmod 3, z + c(i))$ and
$c'(y) = (|y| \bmod 3, z + c(j))$. Since x, y are adjacent, we know i, j are adjacent in G, thus $c(i) \ne c(j)$ which
implies that $c'(x) \ne c'(y)$ as required.

Note that in the special case where G is a Cayley graph of the group $\mathbb{Z}_2^k$, the above upper bound
on $\chi(\mathfrak{C})$ can be modified into the smallest power of 2 that is strictly larger than $\chi(G)$. ■
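
The coloring $c'$ from the proof of Claim 3.9 can be tried out on a small graph. The sketch below takes G to be the 5-cycle (a Cayley graph of $\mathbb{Z}_5$) with a hand-picked proper 3-coloring; this choice of G, the particular coloring `C`, and the helper names are assumptions of this example rather than part of the construction used for Theorem 1.3.

```python
from itertools import product

N = 5
G_EDGES = {frozenset((i, (i + 1) % N)) for i in range(N)}  # G = 5-cycle
C = [0, 1, 0, 1, 2]  # a proper coloring of G with d = 3 colors
D = 3


def color(x):
    """c'(x) = (|x| mod 3, sum_i x_i * c(i) in Z_d)."""
    return (sum(x) % 3, sum(x_i * C[i] for i, x_i in enumerate(x)) % D)


def adjacent(x, y):
    """Adjacency in the confusion graph: x xor y is e_i, or e_i xor e_j with i ~ j in G."""
    diff = [i for i in range(N) if x[i] != y[i]]
    if len(diff) == 1:
        return True
    return len(diff) == 2 and frozenset(diff) in G_EDGES


words = list(product((0, 1), repeat=N))
base_coloring_proper = all(C[i] != C[j] for i, j in map(tuple, G_EDGES))
proper = all(color(x) != color(y) for x in words for y in words if adjacent(x, y))
num_colors = len({color(x) for x in words})
```

The check confirms that $c'$ is a proper coloring of this confusion graph using at most 3d = 9 colors.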

Notice that in the above claim we did not use any of the properties of G, hence it holds for any
graph. This shows that regardless of the choice of G, for the confusion graph defined above, the
gap between $\chi$ and $\chi_f$ is at most 3 times the corresponding gap in the original graph.

Claim 3.10. The fractional chromatic number of $\mathfrak{C}$ satisfies $\chi_f(G) \le \chi_f(\mathfrak{C}) \le 3 \cdot \chi_f(G)$.

Proof. As mentioned above, the lower bound on $\chi_f(\mathfrak{C})$ follows from the induced copy of G in $\mathfrak{C}$,
and it remains to show that $\chi_f(\mathfrak{C}) \le 3\chi_f(G)$.

Since $\mathfrak{C}$ is a Cayley graph, it suffices to show it contains an independent set of size at least
$2^n/(3\chi_f(G))$. Let $I \subseteq K$ be a maximum independent set in G. As G is a Cayley graph, $\chi_f(G) = n/|I|$.
For a vector $u = (u_1, \ldots, u_n) \in \mathbb{Z}_2^n$, define $s(u) \in K$ by $s(u) = \sum_i u_i \cdot k_i$. For any $j \in K$, put

$$I_j = \{u \in \mathbb{Z}_2^n \mid s(u) + j \in I\} .$$

Let u, v be a pair of vertices so that $u \oplus v = e_i \oplus e_j$ and $|u| = |v|$. Hence $s(u) = x + k_i$ and
$s(v) = x + k_j$ for some $x \in K$ (here we rely on K being Abelian). If u, v both belong to $I_l$ for some $l \in K$, it must
be that $k_i$ and $k_j$ are not adjacent in G, and thus u and v are not adjacent in $\mathfrak{C}$.

It now follows that if u and v are vectors in $I_j$ and $|u| \equiv |v| \pmod 3$, then u and v are not
adjacent in $\mathfrak{C}$. Therefore, $I_j$ is a union of three independent sets in $\mathfrak{C}$, and hence at least one of
them is of size at least $|I_j|/3$. This holds for every $j \in K$. When $j \in K$ is chosen randomly and
uniformly, then, by linearity of expectation, the expected number of elements in $I_j$ is exactly

$$2^n \cdot \frac{|I|}{n} = \frac{2^n}{\chi_f(G)} .$$

Thus, there is some choice of j for which $I_j$ is of size at least $2^n/\chi_f(G)$, and $\mathfrak{C}$ contains an
independent set of size at least $|I_j|/3 \ge 2^n/(3\chi_f(G))$, as needed. ■

Corollary 3.11. For any Cayley graph G of an Abelian group, there exists a confusion graph $\mathfrak{C}$
and some $c \in [\frac{1}{3}, 3]$ such that

$$\frac{\chi(\mathfrak{C})}{\chi_f(\mathfrak{C})} = c \cdot \frac{\chi(G)}{\chi_f(G)} .$$

Plugging the graph which is guaranteed by Lemma 3.6 into this construction completes the
proof of Theorem 1.3.

Remark 3.12: If we set $G = K_n$, which is indeed a Cayley graph over an Abelian group, we get
the example from Section 3.1. Some of the claims in this section generalize claims from Section 3.1.



3.3 Applications to Index and Network Coding 

We now consider the more restricted model where there is a single receiver which is interested
in each block. In the directed hypergraph notation, this is equivalent to having precisely m = n
directed edges where each directed edge has a different origin vertex. For easier notation, we can
describe such a scenario (as done in [4], [15]) by a directed graph. Each directed edge (i, J) will be
translated into |J| directed edges (i, j) for all $j \in J$. We can also consider an undirected graph for
the case where the receiver who is interested in $x_i$ knows $x_j$ iff the receiver who is interested in $x_j$
knows $x_i$. We use similar notations to the directed hypergraphs.

Clearly, $\beta_1(k \cdot G) \le k \cdot \beta_1(G)$, as one can always obtain an index code for $k \cdot G$ by taking the
k-fold concatenation of an optimal index code for G. Furthermore, it is not difficult to see that
this bound is tight for all perfect graphs. Hence, the smallest graph where $\beta_1(k \cdot G)$ may possibly
be smaller than $k \cdot \beta_1(G)$ is $C_5$, the cycle on 5 vertices, which is the smallest non-perfect graph. Indeed,
in this case index codes for $k \cdot C_5$ can be significantly better than those obtained by treating each
copy of $C_5$ separately. This is stated in Theorem 1.2 which we now prove.

Proof of Theorem 1.2. One can verify that the following is a maximum independent set of size
5 in the confusion graph $\mathfrak{C}(C_5)$:

$$\{00000,\ 01100,\ 00011,\ 11011,\ 11101\} .$$

In the formulation of Theorem 1.1, $\gamma = 5$, and this theorem now implies that $\beta_1(k \cdot C_5)/k$ tends to
$5 - \log_2 5$ as $k \to \infty$. On the other hand, one can verify¹ that $\chi(\mathfrak{C}(C_5)) = 8$, hence $\beta_1(C_5) = 3$. ■
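The independent set and the resulting rate can be checked mechanically. The following is a minimal sketch, not the authors' program. It assumes the usual confusability rule for an undirected side-information graph (two words are adjacent in the confusion graph iff they differ in some coordinate $i$ while agreeing on every neighbor of $i$); the helper names `confusable`, `is_independent` and `max_independent_size` are hypothetical.

```python
from itertools import combinations
from math import log2

N = 5  # vertices of the cycle C5, indexed 0..4 cyclically


def confusable(x: int, y: int, n: int = N) -> bool:
    """Assumed adjacency rule for the confusion graph of the n-cycle:
    x and y (n-bit masks) differ in some coordinate i and agree on both
    cyclic neighbours of i, so the receiver of block i cannot separate them."""
    d = x ^ y
    bit = lambda i: (d >> (i % n)) & 1
    return any(bit(i) and not bit(i - 1) and not bit(i + 1) for i in range(n))


def is_independent(words, n: int = N) -> bool:
    return not any(confusable(x, y, n) for x, y in combinations(words, 2))


def max_independent_size(n: int = N) -> int:
    """Exhaustive search. The confusion graph is a Cayley graph of Z_2^n,
    hence vertex-transitive, so some maximum independent set contains 0."""
    candidates = [v for v in range(1, 2 ** n) if not confusable(0, v, n)]
    best = 1

    def grow(size, cands):
        nonlocal best
        best = max(best, size)
        for idx, v in enumerate(cands):
            rest = [w for w in cands[idx + 1:] if not confusable(v, w, n)]
            if size + 1 + len(rest) > best:  # prune branches that cannot improve
                grow(size + 1, rest)

    grow(1, candidates)
    return best


# Reading each string as a binary number reverses the bit order, which is
# harmless here because reversal is an automorphism of the cycle.
paper_set = [int(s, 2) for s in ("00000", "01100", "00011", "11011", "11101")]
alpha = max_independent_size()
limit_rate = N - log2(alpha)  # limit of beta_1(k * C5) / k given by Theorem 1.1
```

Under these assumptions, `is_independent(paper_set)` confirms the displayed set, `alpha` is the independence number of the confusion graph, and `limit_rate` is the per-copy cost to compare with the value 3 obtained by treating each copy separately.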

This shows that there is a graph $G$ with an optimal index code $C$, so that much less than $|C|^k$
words suffice to establish an index code for $k \cdot G$, although each of the $k$ copies of $G$ has no side
information on any of the bits corresponding to the remaining copies.

¹This fact can be verified by a computer-assisted proof, as stated in [4].

Remark 3.13: Using the upper bound of (2) in its alternate form, as stated in Remark 2.2, we
obtain that $\beta_1(k \cdot C_5) < k \cdot \beta_1(C_5)$ already for $k = 15$.

The example of $C_5$ can be extended to other examples by looking at the complements of odd
cycles, i.e. $\overline{C_k}$ for any odd $k \ge 5$. All graphs in this family exhibit a gap between the optimal code
for a disjoint union and the concatenation of optimal codes for the single copies. In
Appendix A.2 we prove the following properties of the complements of odd cycles:

Claim 3.14. There exists a constant $c > 1$ so that for any $n \ge 2$, $\chi(\mathfrak{C}(\overline{C_{2n+1}})) \ge c \cdot \chi_f(\mathfrak{C}(\overline{C_{2n+1}}))$.

Theorem 3.15. Let $H_{2n+1} = ([2n+1], E)$, where for each $i \in [2n+1]$ there is a directed edge
$(i, N_{\overline{C_{2n+1}}}(i))$ from $i$ to the set of neighbors of $i$ in $\overline{C_{2n+1}}$. Then any linear code for $H_{2n+1}$
requires 3 letters.

Theorem 3.15 implies that a broadcast network based on the complement of any odd cycle has
a linear code of minimal length 3, regardless of the block length. Specifically for $\overline{C_{23}}$, we know that
$\chi_f(\mathfrak{C}(\overline{C_{23}})) < 4.809$ (as can be seen in Appendix A.3). Therefore, the above mentioned reduction
to Network Coding provides us with an explicit network (of size 48) where the linear code must
be of length at least 3 whereas the optimal code can be of length $\beta \le \beta^* < \log_2 4.809 \approx 2.265$,
yielding a ratio of 1.324. This proves Corollary 1.5. ■
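The arithmetic behind this ratio is easy to reproduce. The sketch below only redoes the computation from the numbers quoted in the text and in Appendix A.3 (an independent set of size 1744414 in the confusion graph on $2^{23}$ vertices); the variable names are illustrative.

```python
from math import log2

n_vertices = 23
alpha_lower = 1_744_414                      # independent set size from Appendix A.3
chi_f_upper = 2 ** n_vertices / alpha_lower  # chi_f = 2^n / alpha for a Cayley graph
rate_upper = log2(chi_f_upper)               # upper bound on the non-linear rate
linear_length = 3                            # length forced on linear codes (Theorem 3.15)
ratio = linear_length / rate_upper           # gap between linear and non-linear codes

print(f"chi_f <= {chi_f_upper:.4f}, rate <= {rate_upper:.4f}, ratio >= {ratio:.4f}")
```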

4 Conclusions and open problems 

• In this work, we have shown that for every broadcast network $H$ with $n$ blocks and $m$ receivers,
and for large values of $k$, $\beta^*_k(H) = \beta_1(k \cdot H) = (n - \log_2 \alpha(\mathfrak{C}_1(H)) + o(1))\, k$, where the $o(1)$
term tends to 0 as $k \to \infty$. For every large constant $C$ there are examples $H$ such that for
large $k$, $\beta^*_k(H)/k < 3$ and yet $\beta_1(H) > C$.

• Our results also imply that encoding the entire block at once can be strictly better than
concatenating the optimal code for $H$ with single-bit blocks. This justifies the definition of
the broadcast rate of $H$, $\beta(H)$, as the optimal asymptotic average number of bits required
per bit in each block for $H$.

• We have shown an infinite family of graphs (including the smallest possible non-perfect graph 
$C_5$) for which there exists a constant $c > 1$ so that for each of these graphs there is a
multiplicative gap of at least c between the chromatic number and the fractional chromatic 
number of their confusion graphs. However, the gap in all these graphs is below 2, and it is 
not known if for graphs this gap can be arbitrarily large. 

• Generalizing the above setting, allowing multiple users to request the same block, allows us 
to construct hypergraphs whose confusion graphs exhibit bigger gaps. 

— We have shown a specific family of confusion graphs where the fractional chromatic
number is bounded ($< 7$) while the chromatic number is unbounded ($\Omega(\sqrt{\log n})$). In
these settings, a 1-bit block length requires us to transmit $\Omega(\log\log n)$ bits, while for
a large $t$-bit block length, the required number of bits is linear in $t$. For other families,
this ratio can even reach $\Theta(\log n)$. More surprisingly, for the first family, a network
consisting of $t$ independent copies of the original one will only require a number of bits
that is linear in $t$.

— With the generalized construction, one can build a hypergraph for any Cayley graph of 
an Abelian group for which the confusion graph maintains the same gap as the original 
graph. The maximum gap that can be obtained in this way is $O(\log n)$, since this is
the maximum possible gap between the fractional and integer chromatic numbers of any
$n$-vertex graph (cf., e.g., [17]).

• Currently, there is no known better upper bound for this gap which is specific to confusion
graphs. The examples above are all exponentially far from the general upper bound $O(\log N)$ for a graph on $N$ vertices,
which in our case equals $\Theta(\log 2^n) = \Theta(n)$.

• An interesting problem in Network Coding is that of deciding whether or not there are 
networks with an arbitrarily large gap between the optimal linear and non-linear flows. Note 
that the network is not allowed to depend on the size of the underlying field. Generalizing
our constructions to create such examples could be interesting.

A Appendix

A.1 Proof of Theorem 3.15

We first need to present some definitions and theorems from [4]. We say that a matrix $A$ fits a
graph $G = (V,E)$ if $A[i,i] \ne 0$ for all $i$ and $A[i,j] = 0$ for all $i \ne j$ with $(i,j) \notin E$ (for $(i,j) \in E$, $A[i,j]$
is not restricted). A generalization of a result in [4] (as noted in [15]) states that the length of the
minimal linear encoding of $G$ over a field $\mathbb{F}$ is always at least the minimal rank over $\mathbb{F}$ of a matrix
$A$ which fits $G$. It therefore suffices to show the following:

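Claim A.1. Let $A$ be a matrix that fits the graph $\overline{C_{2n+1}}$ over some field $\mathbb{F}$. Then $\operatorname{rank}(A) \ge 3$.

Proof. Let $A$ be a matrix that fits $\overline{C_{2n+1}}$. This means that

$$\begin{array}{ll}
\forall i . & A[i,i] \ne 0 , \\
\forall i \in [2n] . & A[i,i+1] = 0 , \\
\forall i \in [2n] . & A[i+1,i] = 0 , \\
 & A[1,2n+1] = A[2n+1,1] = 0 .
\end{array}$$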

Let $A(t)$ denote the $t$'th row of $A$, viewed as a vector. Note that $A(1), A(2)$ are linearly
independent, as $A[1,1] \ne 0$ but $A[2,1] = 0$ and $A[2,2] \ne 0$. Assume, towards a contradiction, that
$\operatorname{rank}(A) = 2$ and hence for every $t$: $A(t) = a_t A(1) + b_t A(2)$. We prove by induction on $t$ that if $t$ is
odd then $b_t = 0$, and if $t$ is even then $a_t = 0$. Note that for any $t$ it is impossible that $a_t = b_t = 0$, as
each row has a nonzero element. For $t = 1, 2$ this is trivial. For some odd $t = 2k+1$, by the induction hypothesis
$A(2k) = b_{2k} A(2)$ and $A(2k-1) = a_{2k-1} A(1)$. This means that

$$A(2k+1) = \frac{a_{2k+1}}{a_{2k-1}} A(2k-1) + \frac{b_{2k+1}}{b_{2k}} A(2k) .$$

As $A[2k-1,2k] = A[2k+1,2k] = 0$ but $A[2k,2k] \ne 0$, this means that $b_{2k+1} = 0$, as required.
A similar argument works for even $t$, which completes the proof of the induction. However, this
leads to a contradiction, as $A(2n+1) = a_{2n+1} A(1)$, and this is impossible as $A[2n+1,1] = 0$
but $A[1,1] \ne 0$. Altogether, the assumption that $\operatorname{rank}(A) = 2$ leads to a contradiction, hence
$\operatorname{rank}(A) \ge 3$. ■

Claim A.1 completes the proof of Theorem 3.15.
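For the smallest case, $2n+1 = 5$ over the field with two elements, the rank bound can be confirmed by exhaustive search. The sketch below is illustrative only and covers just this one field and size; the helpers `gf2_rank` and `fitting_matrices` are hypothetical names. It enumerates all $2^{10}$ binary matrices with ones on the diagonal and zeros at cyclically adjacent positions, and records the smallest rank found.

```python
from itertools import product


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bit masks."""
    basis = []
    for r in rows:
        for b in basis:
            r = min(r, r ^ b)  # clears the leading bit of b from r if it is set
        if r:
            basis.append(r)
    return len(basis)


def fitting_matrices(m):
    """All 0/1 matrices that fit the complement of the m-cycle: ones on the
    diagonal, zeros at cyclically adjacent positions, all other entries free."""
    free = [(i, j) for i in range(m) for j in range(m)
            if j != i and (i - j) % m not in (1, m - 1)]
    for bits in product((0, 1), repeat=len(free)):
        rows = [1 << i for i in range(m)]  # diagonal entries
        for (i, j), b in zip(free, bits):
            rows[i] |= b << j
        yield rows


min_rank = min(gf2_rank(rows) for rows in fitting_matrices(5))
print("minimum rank over GF(2) for 5 vertices:", min_rank)
```

The search finds no fitting matrix of rank below 3, in agreement with Claim A.1, and rank 3 is attained.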
A.2 Complements of odd cycles

We showed that the cycle of length 5 is the smallest graph where there exists a gap between the 
fractional and integer chromatic numbers of its confusion graph. The cycle and its complement on 
5 vertices are isomorphic, however this is not the case for larger odd cycles and their complements. 
We now show that there is a gap between those numbers for any complement of an odd cycle of 5 
or more vertices. 

Throughout this section, let $\mathfrak{C}_{2n+1} = \mathfrak{C}(\overline{C_{2n+1}})$.

Claim A.2. Any independent set $A$ of $\mathfrak{C}_{2n+1}$ can be extended to an independent set $A'$ in $\mathfrak{C}_{2n+3}$
where $|A'| = 4|A|$.

Proof. We first define a function $f$ from the vertices of $\mathfrak{C}_{2n+1}$ into sets of size 4 of vertices
of $\mathfrak{C}_{2n+3}$ which satisfies the following:

• For every vertex $v$ of $\mathfrak{C}_{2n+1}$, $f(v)$ is an independent set in $\mathfrak{C}_{2n+3}$.

• If $u$ and $v$ are not adjacent in $\mathfrak{C}_{2n+1}$, then $f(v) \cup f(u)$ is an independent set in $\mathfrak{C}_{2n+3}$.

• $f(v) \cap f(u) = \emptyset$ for any $u \ne v$.

Given such $f$ and an independent set $A$ of $\mathfrak{C}_{2n+1}$, define $A' = \bigcup_{v \in A} f(v)$, which will be an
independent set of size $4|A|$ in $\mathfrak{C}_{2n+3}$, as needed. We now describe this $f$ explicitly and prove its
properties:



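$$f\big(v = (x_1, x_2, \ldots, x_{2n}, x_{2n+1})\big) = \left\{
\begin{array}{l}
(x_1, x_2, \ldots, x_{2n}, x_{2n+1}, 0, 0) = v' \oplus m_0 \\
(x_1, x_2, \ldots, x_{2n}, \overline{x_{2n+1}}, 0, 1) = v' \oplus m_1 \\
(\overline{x_1}, x_2, \ldots, x_{2n}, x_{2n+1}, 1, 0) = v' \oplus m_2 \\
(\overline{x_1}, x_2, \ldots, x_{2n}, \overline{x_{2n+1}}, 1, 1) = v' \oplus m_3
\end{array}
\right\}$$

where $v'$ is $v$ extended to length $2n+3$ with two additional zeros at the right end and $m_0, m_1, m_2, m_3$
are 4 appropriate constant binary vectors of length $2n+3$.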

• $f(v)$ is an independent set: Since $\mathfrak{C}_{2n+3}$ is a Cayley graph over $\mathbb{Z}_2^{2n+3}$, we only need to show
that the sums of all pairs of the 4 vectors in $f(v)$ are not generators in our graph. All
these sums are in $\{m_1, m_2, m_3\}$ since $m_0 = 0$ and $m_1 \oplus m_2 \oplus m_3 = 0$ (notice that these
sums are independent of the choice of $v$). Since we are looking at the confusion graph of
the complement of an odd cycle, the generators of this graph are vectors of Hamming weight
1, 2 and 3 of consecutive ones (i.e. $e_i$, $e_i \oplus e_{i+1}$ and $e_i \oplus e_{i+1} \oplus e_{i+2}$ for any $i$, where the
indices are reduced modulo $2n+3$). Indeed $m_1, m_2, m_3$ are not generators, therefore $f(v)$
is independent.

• $f(v) \cup f(u)$ is an independent set when $u$ and $v$ are not adjacent: Consider $x = v' \oplus m_i$ and
$y = u' \oplus m_j$ for some $i, j$ (where $u'$ and $v'$ are as before). We want to show that $x \oplus y$ is not
a generator. If $i = j$ then $x \oplus y = u' \oplus v'$ and therefore it is not a generator (two additional
zero bits at the right will not turn a non-generator into a generator). Assume now $i \ne j$; then
we know $x \oplus y = (u' \oplus v') \oplus m_k = (u \oplus v)' \oplus m_k$ for some $k \in \{1, 2, 3\}$. Since $u \oplus v$ is not a
generator, by Claim A.3 we get that $x \oplus y$ is not a generator, as needed.

• For $v \ne u$, $f(v) \cap f(u) = \emptyset$: Let us assume there exists some $x \in f(v) \cap f(u)$. Since all
vectors in both $f(v)$ and $f(u)$ differ at their two rightmost bits $\{x_{2n+2}, x_{2n+3}\}$, there exists
$i \in \{0, 1, 2, 3\}$ so that $x = v' \oplus m_i = u' \oplus m_i$, in contradiction to $v \ne u$. ■

Claim A.3. If $x$ of length $2n+1$ is not a generator, then $x' \oplus m_k$ for $k \in \{1, 2, 3\}$ is not a generator
as well (where $x'$ is $x$ extended with two zero bits on the right as before).

Proof. The cases of $k = 1$ and $k = 2$ are symmetric so we will consider only $k = 1$ and $k = 3$. We
show explicitly what $x$ can be in order for the result to be a generator, and as all such $x$ vectors
turn out to be generators the desired result follows. Since $x'$ is the extension of $x$ with two zero
bits on the right, it cannot affect the two rightmost bits of $x' \oplus m_k$.

• $k = 1$: $m_1 = \underbrace{0 \cdots 0}_{2n-1}0101$, so in order to make it a generator, we must flip bit $2n+1$,
and then we can at most flip 0, 1 or 2 consecutive bits at the leftmost side. Hence,
$x \in \{00\underbrace{0 \cdots 0}_{2n-2}1,\ 10\underbrace{0 \cdots 0}_{2n-2}1,\ 11\underbrace{0 \cdots 0}_{2n-2}1\}$, which are all generators of length $2n+1$.

• $k = 3$: $m_3 = 1\underbrace{0 \cdots 0}_{2n-1}111$, so in order to make it a generator, we must flip at least one of the
bits at locations $\{1, 2n+1\}$ and no other bit. Thus,
$x \in \{0\underbrace{0 \cdots 0}_{2n-1}1,\ 1\underbrace{0 \cdots 0}_{2n-1}0,\ 1\underbrace{0 \cdots 0}_{2n-1}1\}$, which are all generators of length $2n+1$. ■
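The extension map $f$ of Claim A.2 can be exercised on the smallest case, going from 5 to 7 vertices. The sketch below is an illustration under stated assumptions, not the authors' code. The masks are read off from Claim A.3 ($m_1$ has ones at positions $2n+1$ and $2n+3$, $m_3$ at positions $1, 2n+1, 2n+2, 2n+3$, and $m_2 = m_1 \oplus m_3$), and the starting independent set is the one from the proof of Theorem 1.2, relabelled by $i \mapsto 2i \bmod 5$ so that it lives in the confusion graph of the complement of $C_5$. That relabelling is our own choice of isomorphism.

```python
from itertools import combinations


def is_generator(d):
    """Generators of the confusion graph of the complement of an odd cycle:
    1, 2 or 3 cyclically consecutive ones (d is a tuple of bits)."""
    m, w = len(d), sum(d)
    if not 1 <= w <= 3:
        return False
    return any(all(d[(i + k) % m] for k in range(w)) for i in range(m))


def xor(u, v):
    return tuple(a ^ b for a, b in zip(u, v))


def is_independent(words):
    return not any(is_generator(xor(u, v)) for u, v in combinations(words, 2))


def f(v):
    """The four extensions v' + m_k of a word v of length 2n+1."""
    m = len(v)       # m = 2n+1
    vp = v + (0, 0)  # v': v padded with two zeros on the right

    def mask(*ones):  # 1-indexed positions of the ones, length 2n+3
        return tuple(int(p in ones) for p in range(1, m + 3))

    masks = [mask(), mask(m, m + 2), mask(1, m + 1), mask(1, m, m + 1, m + 2)]
    return [xor(vp, mk) for mk in masks]


# Independent set of size 5 for C5, relabelled by i -> 2i mod 5 (assumption:
# this is the isomorphism we use from C5 onto its complement).
c5_set = ["00000", "01100", "00011", "11011", "11101"]
A = []
for s in c5_set:
    t = [0] * 5
    for i, ch in enumerate(s):
        t[(2 * i) % 5] = int(ch)
    A.append(tuple(t))

A_ext = [w for v in A for w in f(v)]  # candidate independent set on 7 coordinates
print(len(A), "->", len(set(A_ext)), "independent:", is_independent(A_ext))
```

With these inputs the extended set has $4 \cdot 5 = 20$ distinct words, so it certifies the bound $2^7/20 = 6.4$ on the fractional chromatic number for 7 vertices.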

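Claim A.4. The fractional chromatic number of $\mathfrak{C}_{2n+1}$ is monotone (weakly) decreasing with $n$, that is,

$$\chi_f(\mathfrak{C}_{2n+3}) \le \chi_f(\mathfrak{C}_{2n+1}) .$$

Proof. Since $\mathfrak{C}_{2n+1}$ is a Cayley graph, we know $\chi_f(\mathfrak{C}_{2n+1}) = 2^{2n+1}/\alpha(\mathfrak{C}_{2n+1})$. Using Claim A.2
we know $\alpha(\mathfrak{C}_{2n+3}) \ge 4\alpha(\mathfrak{C}_{2n+1})$ and therefore

$$\chi_f(\mathfrak{C}_{2n+3}) = 2^{2n+3}/\alpha(\mathfrak{C}_{2n+3}) \le 2^{2n+1}/\alpha(\mathfrak{C}_{2n+1}) = \chi_f(\mathfrak{C}_{2n+1}) . \ ■$$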

Corollary A.5. For any $n \ge 8$, $\chi_f(\mathfrak{C}_{2n+1}) < 4.99$ ($< 5$).

Proof. By a computer search (see Appendix A.3) one can see that the fractional chromatic number
of the confusion graph of the complement of an odd cycle on 17 vertices is below 5. Since this
property is monotone, this holds for any $n \ge 8$. ■

Proof of Claim 3.14. The authors of [4] showed that the chromatic number of the confusion
graph of any complement of an odd cycle is strictly bigger than 4 and is at most 8. We have shown
that the fractional chromatic number of the confusion graph of a cycle on 5 vertices is $32/5 = 6.4$, and
we just proved it is monotone decreasing. Since the number of vertices in these confusion graphs
is a power of 2, the fractional chromatic number, which equals $2^{2n+1}/\alpha(\mathfrak{C}_{2n+1})$, can be an integer
only if it is itself a power of 2. Hence it cannot be an integer strictly between 4 and 8, so it is
always strictly smaller than the integer chromatic number.

We know that for any $n \ge 8$ there is a gap of at least $5/\chi_f(\mathfrak{C}_{17})$. For smaller values of $n$ there
is some fixed gap exceeding 1, so taking the minimum of these gaps gives a single $c$ for
which the claim holds. ■

The limit $\lim_{n \to \infty} \chi_f(\mathfrak{C}_{2n+1})$, which exists by monotonicity, remains unknown at this time.

Claim A.6. The chromatic number of $\mathfrak{C}_{2n+1}$ is monotone (weakly) decreasing with $n$: $\chi(\mathfrak{C}_{2n+3}) \le \chi(\mathfrak{C}_{2n+1})$.

Proof. A coloring of $\mathfrak{C}_{2n+1}$ with $k$ colors is a partition of the graph into $k$ independent sets. We
can obtain a coloring of $\mathfrak{C}_{2n+3}$ with the same number of colors by applying the extension described
before to each of the independent sets. We already proved that the new sets are independent
sets in the new graph. It remains to prove that every vertex of the new graph belongs to one of these
independent sets. This follows by noticing that the size of each new set is 4 times its previous
size, and since the new sets are pairwise disjoint (as we have seen before) they must cover the entire
graph (as the size of the graph is precisely 4 times that of the previous one). ■

Corollary A.7. For any $n \ge 3$, $5 \le \chi(\mathfrak{C}_{2n+1}) \le 7$.

Proof. By a computer search (see Appendix A.4) one can see that for 7 vertices, the integer chromatic
number is at most 7. By monotonicity and the fact that it must be at least 5 (as shown in
[4]) the desired result follows. ■

As in the fractional case, the limit $\lim_{n \to \infty} \chi(\mathfrak{C}_{2n+1})$ exists and can only be 5, 6 or 7; however, it
remains unknown at this time.

A.3 Fractional chromatic number upper bounds for $\mathfrak{C}(\overline{C_n})$

Here is a table of upper bounds for the fractional chromatic number of the confusion graphs of the
complements of odd cycles. These upper bounds were found by searching for a large independent
set in each of these graphs, which suffices since they are Cayley graphs. The search was done by a computer program
which does not guarantee an optimal result, hence it only provides the bounds stated in the
table, which are not necessarily tight.



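| $n$ | $\alpha(\mathfrak{C}(\overline{C_n}))$ | $\chi_f(\mathfrak{C}(\overline{C_n}))$ |
|---|---|---|
| 5 | $5$ | $2^{5}/5 = 6.4$ |
| 7 | $\ge 22$ | $\le 2^{7}/22 \approx 5.818$ |
| 9 | $\ge 93$ | $\le 2^{9}/93 \approx 5.505$ |
| 11 | $\ge 386$ | $\le 2^{11}/386 \approx 5.306$ |
| 13 | $\ge 1586$ | $\le 2^{13}/1586 \approx 5.165$ |
| 15 | $\ge 6476$ | $\le 2^{15}/6476 \approx 5.060$ |
| 17 | $\ge 26317$ | $\le 2^{17}/26317 \approx 4.981$ |
| 19 | $\ge 106744$ | $\le 2^{19}/106744 \approx 4.912$ |
| 21 | $\ge 430592$ | $\le 2^{21}/430592 \approx 4.870$ |
| 23 | $\ge 1744414$ | $\le 2^{23}/1744414 \approx 4.809$ |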



Although the bounds are not necessarily tight, they clearly suggest a monotone behavior of the 
fractional chromatic numbers of these graphs. 
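The last column of the table follows from the middle one by the identity $\chi_f = 2^n/\alpha$ for Cayley graphs. As a small sanity check (a sketch using only the independent set sizes listed above; the dictionary names are ours), one can recompute the bounds and confirm both that they decrease and that consecutive rows respect the inequality $\alpha(\mathfrak{C}_{2n+3}) \ge 4\alpha(\mathfrak{C}_{2n+1})$ of Claim A.2.

```python
# Independent set sizes reported in the table, indexed by the cycle length n.
alpha_lower = {5: 5, 7: 22, 9: 93, 11: 386, 13: 1586, 15: 6476,
               17: 26317, 19: 106744, 21: 430592, 23: 1744414}

# chi_f <= 2^n / alpha for each row.
chi_f_upper = {n: 2 ** n / a for n, a in alpha_lower.items()}
for n, bound in chi_f_upper.items():
    print(f"n = {n:2d}: chi_f <= {bound:.4f}")

values = list(chi_f_upper.values())
strictly_decreasing = all(a > b for a, b in zip(values, values[1:]))

# Each row is at least as good as what the 4-fold extension of the
# previous row would give.
consistent_with_extension = all(
    alpha_lower[n + 2] >= 4 * alpha_lower[n] for n in alpha_lower if n + 2 in alpha_lower
)
```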

The computer program which found most of these sets and a program that verifies an independent
set (of a specific form) can be found at www.math.tau.ac.il/~amitw/broadcasting.

A.4 Coloring the confusion graph of $\overline{C_7}$ with 7 colors

It was proved in [4] that the length of an optimal index code for any complement of an odd cycle is precisely 3; however,
the minimum number of codewords can vary between 5 and 8. Here we show a legal coloring using
7 colors for $n = 7$, which was found using a computer program. Each cell in the following table
represents a vertex out of the 128 vertices in the graph, which are $\{0, 1, \ldots, 127\}$. The vertex is the
sum of its two indices in the table, e.g. the bolded vertex in the table is $16 + 4$, which is colored
with the seventh color.








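|     | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0   | 1 | 2 | 3 | 4 | 2 | 7 | 6 | 5 | 3 | 4 | 7 | 2 | 4 | 1 | 5 | 7 |
| 16  | 2 | 7 | 4 | 3 | **7** | 5 | 1 | 4 | 4 | 3 | 2 | 1 | 5 | 6 | 7 | 2 |
| 32  | 3 | 4 | 7 | 2 | 4 | 1 | 5 | 7 | 1 | 2 | 3 | 4 | 2 | 7 | 6 | 5 |
| 48  | 4 | 3 | 2 | 1 | 5 | 6 | 7 | 2 | 2 | 7 | 4 | 3 | 7 | 5 | 1 | 4 |
| 64  | 5 | 7 | 1 | 6 | 3 | 5 | 2 | 4 | 7 | 6 | 5 | 1 | 6 | 3 | 4 | 2 |
| 80  | 1 | 5 | 6 | 7 | 2 | 4 | 3 | 6 | 6 | 1 | 7 | 5 | 4 | 2 | 5 | 3 |
| 96  | 7 | 6 | 5 | 1 | 6 | 3 | 4 | 2 | 5 | 7 | 1 | 6 | 3 | 5 | 2 | 4 |
| 112 | 6 | 1 | 7 | 5 | 4 | 2 | 5 | 3 | 1 | 5 | 6 | 7 | 2 | 4 | 3 | 6 |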



A computer program that verifies that this is indeed a legal coloring can also be found at
www.math.tau.ac.il/~amitw/broadcasting.
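Independently of the authors' program, the coloring can be re-checked in a few lines. The sketch below is our own and rests on one assumption that the text does not spell out: a vertex $v \in \{0, \ldots, 127\}$ is identified with the 7-bit word given by its binary expansion, with the bits arranged cyclically. Two vertices are adjacent iff their XOR is a generator, i.e. 1, 2 or 3 cyclically consecutive ones. The color values are transcribed from the table (rows $0, 16, \ldots, 112$, columns $0, \ldots, 15$).

```python
M = 7  # number of vertices of the odd cycle

# Row label -> colors of the vertices row + 0, ..., row + 15.
ROWS = {
    0:   [1, 2, 3, 4, 2, 7, 6, 5, 3, 4, 7, 2, 4, 1, 5, 7],
    16:  [2, 7, 4, 3, 7, 5, 1, 4, 4, 3, 2, 1, 5, 6, 7, 2],
    32:  [3, 4, 7, 2, 4, 1, 5, 7, 1, 2, 3, 4, 2, 7, 6, 5],
    48:  [4, 3, 2, 1, 5, 6, 7, 2, 2, 7, 4, 3, 7, 5, 1, 4],
    64:  [5, 7, 1, 6, 3, 5, 2, 4, 7, 6, 5, 1, 6, 3, 4, 2],
    80:  [1, 5, 6, 7, 2, 4, 3, 6, 6, 1, 7, 5, 4, 2, 5, 3],
    96:  [7, 6, 5, 1, 6, 3, 4, 2, 5, 7, 1, 6, 3, 5, 2, 4],
    112: [6, 1, 7, 5, 4, 2, 5, 3, 1, 5, 6, 7, 2, 4, 3, 6],
}
color = {row + col: c for row, cells in ROWS.items() for col, c in enumerate(cells)}


def rot(x, s):
    """Cyclic left rotation of an M-bit mask by s positions."""
    s %= M
    return ((x << s) | (x >> (M - s))) & (2 ** M - 1)


# Generators: runs of 1, 2 or 3 cyclically consecutive ones.
GENERATORS = {rot(run, s) for run in (0b1, 0b11, 0b111) for s in range(M)}


def conflicts():
    """Adjacent pairs that received the same color (empty for a legal coloring)."""
    return [(v, v ^ g) for v in range(2 ** M) for g in GENERATORS
            if color[v] == color[v ^ g]]


print("monochromatic edges:", len(conflicts()))
```

Because the generator set is invariant under rotation and reversal of the bit positions, the check does not depend on whether the binary expansion is read from the most or the least significant bit.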






